Inferring Decision Trees Using the Minimum Description Length Principle

Authors

  • J. Ross Quinlan
  • Ronald L. Rivest
Abstract

This paper concerns methods for inferring decision trees from examples for classification problems. The reader who is unfamiliar with this problem may wish to consult J. R. Quinlan’s paper (1986), or the excellent monograph by Breiman et al. (1984), although this paper will be self-contained. This work is inspired by Rissanen’s work on the minimum description length principle (or MDLP for short) and on his related notion of the stochastic complexity of a string (Rissanen, 1986b). The reader may also want to refer to related work by Boulton and Wallace (1968, 1973a, 1973b), Georgeff and Wallace (1984), and Hart (1987). Roughly speaking, the minimum description length principle states that the best “theory” to infer from a set of data is the one which minimizes the sum of

Similar resources

Inferring Reduced Ordered Decision Graphs of Minimum Description Length

We propose a heuristic algorithm that induces decision graphs from training sets using Rissanen's minimum description length principle to control the tradeoff between accuracy in the training set and complexity of the hypothesis description.

Context Maximizing : Finding MDL Decision Trees

We present an application of the context weighting algorithm. Our objective is to classify objects with decision trees. The best tree will be searched for with the Minimum Description Length Principle. In order to find these trees, we modified the context weighting algorithm.

Causal Inference on Multivariate and Mixed-Type Data

Given data over the joint distribution of two random variables X and Y, we consider the problem of inferring the most likely causal direction between X and Y. In particular, we consider the general case where both X and Y may be univariate or multivariate, and of the same or mixed data types. We take an information-theoretic approach, based on Kolmogorov complexity, from which it follows that...

Causal Inference on Multivariate Mixed Type Data

Given data over the joint distribution of two univariate or multivariate random variables X and Y of mixed or single type data, we consider the problem of inferring the most likely causal direction between X and Y. We take an information-theoretic approach, from which it follows that first describing the data over cause and then that of effect given cause is shorter than the reverse direction. F...

Attribute Value Selection Considering the Minimum Description Length Approach and Feature Granularity

In this paper we introduce a new approach to automatic attribute and granularity selection for building optimum regression trees. The method is based on the minimum description length principle (MDL) and aspects of granular computing. The approach is verified by giving an example using a data set which is extracted and preprocessed from an operational information system of the Components Toolsh...

Journal:
  • Inf. Comput.

Volume: 80

Publication date: 1989